ABSTRACT

We introduce a comprehensive analysis of multi-epoch stellar line-of-sight velocities to determine the
intrinsic velocity dispersion of the ultrafaint satellites of the Milky Way. Our method includes a simultaneous
Bayesian analysis of both membership probabilities and the contribution of binary orbital motion to the
observed velocity dispersion within a 14-parameter likelihood. We apply our method to the Segue 1 dwarf galaxy
and conclude that Segue 1 is a dark-matter-dominated galaxy at high probability with an intrinsic velocity
dispersion of $3.7^{+\ldots}_{-\ldots}$ km s$^{-1}$. The dark matter halo required to produce this dispersion must have an average
density of $\rho_{1/2} = 2.5^{+\ldots}_{-\ldots}\,M_\odot$ pc$^{-3}$ within a sphere that encloses half the galaxy's stellar luminosity. This is the
highest measured density of dark matter in the Local Group. Our results show that a significant fraction of the
stars in Segue 1 may be binaries with the most probable mean period close to 10 years, but also consistent with
the 180 year mean period seen in the solar vicinity at about 1$\sigma$. Despite this binary population, the possibility
that Segue 1 is a bound star cluster with the observed velocity dispersion arising from the orbital motion of
binary stars is disfavored by the multi-epoch stellar velocity data at greater than 99% C.L. Finally, our treatment
yields a projected (two-dimensional) half-light radius for the stellar profile of Segue 1 of $R_{1/2} = \ldots$, in
excellent agreement with photometric measurements.
Subject headings: dark matter — galaxies: dwarf — galaxies: individual: Segue 1 — binaries: spectroscopic 
— techniques: radial velocities — galaxies: kinematics and dynamics 
1. INTRODUCTION

The discovery of faint satellites of the Milky Way has been 
revolutionized by the Sloan Digital Sky Survey (SDSS) data
(Willman et al. 2005; Zucker et al. 2006; Belokurov et al.
2007). These galaxies are much fainter than previously known
Milky Way satellites, and the inferred velocity dispersions
range from ~3 to 8 km s$^{-1}$ (Kleyna et al. 2005; Martin et al.
2007; Simon & Geha 2007; Geha et al. 2009). Particularly at
the lower end of this range, the inferred dispersions are sus- 
ceptible to systematic biases. The most serious of these issues 
are the contribution of binary orbital motions to the velocity
dispersion (Olszewski et al. 1996; Hargreaves et al. 1996;
Odenkirchen et al. 2002; Minor et al. 2010) and contamination
of dwarf galaxy member samples by Milky Way stars
(Adén et al. 2009). These problems are most critical for the
ultrafaint satellites with small velocity dispersions because 
the stellar velocity samples are limited in size and contribu- 
tions from binary or nonmember (Milky Way or overlapping 
stream) stars to the measured velocity dispersion may repre- 
sent an appreciable fraction of the galaxy's intrinsic disper- 
sion. Binaries have been the most difficult of these potential 
biases to correct because the properties of binary stars in envi- 
ronments beyond the solar neighborhood are not well known 
and can only be constrained observationally with large num- 
bers of high-precision radial velocity measurements. 

Among the newly discovered ultrafaint dwarf galaxies,
Segue 1 has received much attention because its proximity
and apparently high mass-to-light ratio make it an
ideal target for indirect dark-matter-detection experiments
(Geha et al. 2009; Martinez et al. 2009; Scott et al. 2010;
Essig et al. 2010). However, for the reasons outlined above,
the inferred intrinsic velocity dispersion may be susceptible
to systematic biases (Niederste-Ostholt et al. 2009). A confident
assessment of these biases requires a larger data set,
with repeat velocity measurements and an in-depth study of 
membership issues, contamination by streams, and the contri- 
bution to the dispersion from binary orbital motions. In this 
paper, we undertake this task using the spectroscopic sample
of stars presented in Simon et al. (2010, hereafter Paper
I), which also contains the main results of our work. In the
present companion paper, we describe in detail our methodology
and the results pertaining to the intrinsic velocity dispersion
of Segue 1. We emphasize, though, that the methodology
is general and can be applied to any dispersion-supported sys- 
tem such as dwarf spheroidal satellites and globular clusters. 

As a motivation for the methods to be discussed in this pa- 
per, we highlight two crucial issues. The first is related to 
velocity outlier stars. The analysis of small data sets with a 
few tens to ~ 100 stars, typical of ultrafaint dwarfs, is always 
susceptible to large changes due to the inclusion or exclusion 
of certain outliers. For example, in the present Segue 1 sample 
the exclusion of one star (SDSSJ100704.35+160459.4) with 
an intermediate membership probability reduces the maxi- 
mum likelihood velocity dispersion by ~ 30%. A fully 
Bayesian analysis does not suffer from this drawback, as we
explicitly show in this paper. 

The second issue is related to repeat measurements with 
variable measurement errors. Among the brightest and best- 
studied stars in the Segue 1 sample, the six red giants and two 
horizontal branch stars, there are at least three radial velocity 
variables. Two of these we identify as RR Lyrae variables, but 
the third appears very likely to be a binary star, and two additional
giants show some (< 2$\sigma$) evidence for velocity changes
as well. Although the number of stars with multiple high- 
quality velocity measurements is small, the observed variabil- 
ity of the red giant branch (RGB) stars may be larger than 
what would be expected if the binary population were similar 
to that of the Milky Way field. This raises the concern that 
Segue 1 could have a high fraction of binary stars with peri- 
ods short enough (< 10 years) to inflate the observed velocity 
dispersion significantly.

A recent study by Minor et al. (2010) showed that for dwarf
galaxies with multi-epoch samples of ~100 or more stars, the
binary contribution is unlikely to inflate the inferred velocity 
dispersion by more than 30%. They also provide a method to 
correct the velocity dispersion for binaries using multi-epoch 
data. In the case of Segue 1, however, the confirmed mem- 
ber sample is 71 stars (complete down to r = 21.7; Paper I), 
roughly half of which have multi-epoch measurements at the 
present time. Two of these members are RR Lyrae variables, 
which undergo large velocity variations and therefore should 
not be used in the dispersion calculation, leaving 69 members 
for our purposes. Further, the vast majority of the sample is 
made up of main-sequence stars for which the measurement 
errors are quite large, averaging ~5.5 km s$^{-1}$, making the inferred
dispersion less robust. The large errors also compound
the difficulty of constraining the nature of the binary popu- 
lation, since the non-Gaussian tail in the line-of-sight veloc- 
ity distribution produced by short-period binaries can be ef- 
fectively hidden by large measurement errors. Owing to the 
small multi-epoch sample and the large and variable measurement
errors, the binary correction given in Minor et al. (2010)
cannot be straightforwardly applied to the Segue 1 data set.

We therefore extend the work of Minor et al. (2010) and
consider the full likelihood for multi-epoch velocity mea- 
surements. Along with this extension, we introduce a new 
method to constrain the velocity dispersion of ultrafaint dwarf 
spheroidal galaxies by a comprehensive Bayesian analysis. 
We apply this method to an essentially complete spectro- 
scopic sample of stars within a radius of about 70 pc from 
the center of Segue 1 as described in detail in Paper I, and 
infer the intrinsic dispersion of Segue 1. We find with high 
confidence that Segue 1 has a large intrinsic dispersion (~4
km s$^{-1}$), as originally estimated by Geha et al. (2009), despite
evidence of its binary population having shorter periods than 
those observed in the solar neighborhood. 

In our method, we model the multi-epoch likelihood of 
foreground Milky Way stars and both binary and non-binary 
stars within Segue 1. This likelihood uses velocity, metal- 
licity, position, and magnitude information to help determine 
membership and binary properties. In contrast to previous 
methods, our calculation does not require determining mem- 
bership probabilities a priori - they are implicit in the cal- 
culation. It has the additional benefit that constraints on the 
galaxy's binary population can be obtained simultaneously 
with the velocity dispersion. Furthermore, by adding more 
parameters, our Bayesian analysis can be easily extended to 
constrain other quantities of interest, e.g., the mass contained
within a given radius or the galaxy's dark matter annihilation
signal. 

Our method can also be used to investigate the presence 
of additional populations (e.g., an overlapping stream, or the 
presence of distinct stellar populations in a dSph). Our pre- 
liminary analysis along these lines has not revealed any evi- 
dence for multiple populations in Segue 1, although the data 
also cannot rule out that possibility. In addition, allowing 
for the possibility that the stellar velocities are drawn from 
a dwarf spheroidal plus a separate stream-like population has 
no significant effect on the inferred intrinsic dispersion. 

In Section 2 we derive a likelihood for both member
stars and foreground Milky Way stars. In Section 3 we
derive a multi-epoch likelihood for binary stars and show how
this can be generated by a Monte Carlo simulation. In Section
3.2 we discuss our priors on the binary population and how
they affect the derived binary constraints of Segue 1. The inferred
velocity dispersion using this method is given in Section 4,
and the constraints on Segue 1's binary population are
discussed in Section 5. In Section 6 we discuss the possibility
of contamination by the Sagittarius tidal stream, and we conclude
in Section 7.

2. BAYESIAN METHOD: INCORPORATING 
MEMBERSHIP 

Membership determination is crucial in estimating Segue
1's dynamical properties because the inclusion or exclusion of
stars from the proposed Segue 1 sample may drastically affect
the derived constraints. The most striking example is the star
SDSSJ100704.35+160459.4, which is a 6$\sigma$ velocity outlier
but has a relatively high probability of membership due to its
close proximity to the projected center of Segue 1. If this star
is assumed to be a member, the inferred maximum-likelihood
velocity dispersion increases from $\sigma \approx 4.0$ km s$^{-1}$ to $\sigma \approx 5.5$
km s$^{-1}$ (Paper I).
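The sensitivity of a small sample to a single outlier can be illustrated with a minimal sketch. It assumes a plain Gaussian velocity model with per-star measurement errors and uses synthetic velocities, not the Segue 1 data. The function name `ml_dispersion`, the grid search, and all numerical values are illustrative choices, not taken from Paper I.

```python
import numpy as np

def ml_dispersion(v, e, sigma_grid):
    """Maximum-likelihood intrinsic dispersion for Gaussian velocities
    with per-star measurement errors e (same units as v)."""
    v, e = np.asarray(v, float), np.asarray(e, float)
    best_sigma, best_lnl = None, -np.inf
    for sigma in sigma_grid:
        var = sigma**2 + e**2                      # intrinsic + measurement variance
        mu = np.sum(v / var) / np.sum(1.0 / var)   # ML mean at this sigma
        lnl = -0.5 * np.sum((v - mu) ** 2 / var + np.log(2.0 * np.pi * var))
        if lnl > best_lnl:
            best_sigma, best_lnl = sigma, lnl
    return best_sigma

# Synthetic sample (assumed values): 60 stars, true dispersion 4 km/s,
# 3 km/s measurement errors, systemic velocity 209 km/s.
rng = np.random.default_rng(1)
e = np.full(60, 3.0)
v = rng.normal(209.0, 4.0, 60) + rng.normal(0.0, e)
grid = np.linspace(0.05, 15.0, 300)

s_without = ml_dispersion(v, e, grid)
# Add one star roughly 30 km/s away from the systemic velocity.
s_with = ml_dispersion(np.append(v, 240.0), np.append(e, 3.0), grid)
```

Comparing `s_with` to `s_without` shows the point estimate jumping when a single star is added, which is the behavior the mixture-model treatment below is designed to avoid.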

The most sophisticated and statistically correct method of 
membership determination described so far is the expectation
maximization algorithm of Walker et al. (2009b), with the primary
aim of determining membership probabilities for stars.
We extend this method in two essential ways: to allow for parameter
space exploration and parameter estimation. As with
the Walker et al. method (illustrated in Figure 1), we do so
by modeling both the Milky Way and Segue 1 and simultane- 
ously constraining the model parameters using the complete 
data set. Since the membership probabilities are naturally in- 
corporated into the analysis, this approach obviates the need 
to directly evaluate the membership of individual stars. 

Let us suppose that in the (largely) magnitude-limited 
Segue 1 sample, a fraction F of the stars are members. To 
eliminate obvious nonmember stars, a color-magnitude cut 
around the best-fit isochrone is made; spectroscopic measure- 
ments are then obtained for the remaining stars. For simplic- 
ity, we start with the assumption that each star has a single 
velocity measurement v and reduced equivalent width (EW) 
of Ca II triplet absorption lines $w$ (metallicity indicator, see
Paper I). If multiple measurements are made, the velocities 
and metallicities in the following formulas can be replaced by 
their average values over multiple epochs, suitably weighted 
by the measurement errors as described further in Section 3
(see Equations (14) and (15)). For each star we define $R$ to
be its projected radius from the center of Segue 1. The center
obtained from SDSS photometry (Martin et al. 2008) is offset
by about 32" from the mean stellar position of our spectroscopic
sample within 10'. However, we ran our full analysis
by changing the center from the SDSS photometry to the
mean stellar position of our sample and found that this had little
effect on the posterior of the intrinsic velocity dispersion.

Fig. 1. Distributions of the complete Segue 1 data set in line-of-sight velocity (left) and reduced calcium triplet equivalent width, a proxy for metallicity
(right). We infer the velocity dispersion of Segue 1 by fitting the combined probability distribution function (dotted magenta line), composed of both the Milky
Way (dashed blue line) and Segue 1 (dash-dotted green line) distributions, to the complete data set (solid black line); this eliminates the requirement to determine
the membership of each star a priori. The above graphs illustrate a high likelihood parameter set that describes the data well. These parameters are marginalized
over to obtain probability density functions of relevant model parameters (e.g., dispersion and half-light radius).

Therefore, for the rest of the analysis, we fix the center to the 
SDSS photometry value. Assuming there are only two stellar 
populations, the Milky Way (MW) and Segue 1 (gal) galaxies, 
the joint likelihood for a single data point $\mathscr{D}_i = \{v, e, w, e_w, R\}_i$
is

$$\mathcal{L}(\mathscr{D}_i|\mathscr{M}) = F\,\mathcal{L}_{\rm gal}(\mathscr{D}_i|\mathscr{M}_{\rm gal}) + (1 - F)\,\mathcal{L}_{\rm MW}(\mathscr{D}_i|\mathscr{M}_{\rm MW}). \tag{1}$$

Here, $\mathcal{L}_{\rm gal}$ and $\mathcal{L}_{\rm MW}$ are the individual probability distributions
of Segue 1 and the Milky Way, parameterized by the sets
$\mathscr{M}_{\rm gal}$ and $\mathscr{M}_{\rm MW}$. All sources of measurement error in $v$ and $w$ are
included in $e$ and $e_w$, respectively, and we model the measurements
as being drawn from a Gaussian distribution with
these errors. The metallicity distributions of the member and
nonmember stars are each modeled by Gaussians with mean
metallicities $\bar{w}_{\rm gal}$, $\bar{w}_{\rm MW}$ and widths $\sigma_{w,{\rm gal}}$, $\sigma_{w,{\rm MW}}$, respectively.
We assume that metallicity has no spatial or velocity depen- 
dence because no metallicity gradients have been detected in 
any of the ultrafaint dwarfs. The likelihood is assumed to be 
separable in velocity, position, and metallicity, so that each 
individual probability distribution can now be written as 

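$$\mathcal{L}_{\rm gal,MW}(v, w, R) = \mathcal{L}_{\rm gal,MW}(w)\,\mathcal{L}_{\rm gal,MW}(v|R)\,\mathcal{L}_{\rm gal,MW}(R), \tag{2}$$

where

$$\mathcal{L}_{\rm gal,MW}(w) = \frac{1}{\sqrt{2\pi\left(\sigma_{w,{\rm gal,MW}}^2 + e_w^2\right)}} \exp\left[-\frac{(w - \bar{w}_{\rm gal,MW})^2}{2\left(\sigma_{w,{\rm gal,MW}}^2 + e_w^2\right)}\right]. \tag{3}$$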



We have momentarily dropped the model parameter notation
$\mathscr{M}$ for clarity. The last factor in Equation (2) has a simple
physical interpretation: the spatial probability distribution is
the projected number (surface) density of stars normalized to
unity. Note, however, that this surface density is the density of
observed stars, which may be heavily influenced by selection
biases. Thus, we write the observed spatial probability density
as $\mathcal{L}(R) = n(R)S(x, y)/N$, where $n(R)$ is the actual surface
density of the member stars, $N$ is the total number of stars
in the sample, and $S(x, y)$ represents any bias introduced by
observational selection. In the classical dSphs, which contain
hundreds to thousands of bright member stars, the selection
function may be difficult to quantify, but in the much sparser
ultrafaints it is often more straightforward to model the spectroscopic
selection (Willman et al. 2010; Simon et al. 2010).
To avoid spatial selection biases, we use the conditional likelihood
$\mathcal{L}(v, w|R) = \mathcal{L}(v, w, R)/\mathcal{L}(R)$. From the previous discussion,
we have



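$$\mathcal{L}(v, w|R) = f(R)\,\mathcal{L}_{\rm gal}(w)\,\mathcal{L}_{\rm gal}(v|R) + (1 - f(R))\,\mathcal{L}_{\rm MW}(w)\,\mathcal{L}_{\rm MW}(v|R), \tag{4}$$

where $f(R)$ is the fraction of stars that are dwarf galaxy members
at the position $R$:

$$f(R) = \frac{n_{\rm gal}(R)}{n_{\rm gal}(R) + n_{\rm MW}(R)}. \tag{5}$$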



In principle, the selection bias affects the Milky Way and dSph 
distributions equally, so that by Equation (5) the membership
fraction f(R) should be insensitive to these selection biases. 
Put another way, the spatial selection bias only affects the total 
number of stars selected and not the fraction of those stars that 
are members. 
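The cancellation can be written out in one line. This is a short worked step, assuming (as the argument above does) that the same selection function $S(x, y)$ multiplies both the dwarf galaxy and the Milky Way surface densities of observed stars:

```latex
f(R) = \frac{n_{\rm gal}(R)\,S(x,y)}{n_{\rm gal}(R)\,S(x,y) + n_{\rm MW}(R)\,S(x,y)}
     = \frac{n_{\rm gal}(R)}{n_{\rm gal}(R) + n_{\rm MW}(R)} .
```

If the selection differed between the two populations (for example, through a color cut that favors one of them), the factors would not cancel and $f(R)$ would inherit the bias.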

The Segue 1 data set is unusual in that, within the
given color, magnitude, and spatial cuts, the sample is es- 
sentially complete up to its magnitude limit of r = 21.7, al- 
though it does also extend to somewhat fainter magnitudes 
and larger radii (Paper I). Thus, spatial selection biases are 
not expected to be significant and we may also use the full 
likelihood, $\mathcal{L}(v, w, R)$, incorporating the spatial dependence
directly. The conditional likelihood (Equation (4)) is better
suited for situations where the spectroscopic data set is not 
complete, which is more typical. We find that the inferred ve- 
locity dispersion of Segue 1 is insensitive to whether we use 
the full likelihood or the conditional likelihood. However, as 
shown below, the full likelihood does provide a tighter con- 
straint on the stellar distribution itself. The results obtained 
from these two methods are compared in Table 2. We also list
the priors used for each parameter in that table. In this paper
we use the conditional likelihood (Equation (4)) by default
unless the positional information becomes important. This is
the case for the half-light radius and the inferred dark matter
density within the half-light radius, for which we quote results
obtained from the full likelihood (Equation (2)).

The projected number density of the dSph stars is modeled
by a modified Plummer profile of the form

$$n_{\rm gal}(R) \propto \left(1 + (R/R_s)^2\right)^{-(\alpha - 1)/2}, \tag{6}$$

where $\alpha = 5$ is the standard Plummer profile integrated along
the line of sight. Using the conditional likelihood, the data
are not able to constrain the outer slope ($\alpha$). But the full likelihood
analysis does provide a modest constraint, $\alpha = 4.1^{+\ldots}_{-\ldots}$,
which is consistent with a Plummer profile (see Table 2). The
number density of Milky Way stars is assumed to be spatially
constant over the field of view, which should be a reasonable
approximation for compact systems such as Segue 1. The normalization
of the Milky Way likelihood in $R$, which we call
$n_{\rm MW,0}$, is thus determined solely by the cutoff radius, which
we take to be that of the star farthest from the center of the
galaxy. For determining membership, however, only the relative
normalization between the dSph and Milky Way number
densities is important in Equation (5); this is given by

$$N_{\rm gal} = \frac{n_{\rm gal}(0)}{n_{\rm MW,0}}. \tag{7}$$

We therefore include $N_{\rm gal}$ as a model parameter.
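Equations (5) through (7) combine into a closed form for the membership fraction. The sketch below assumes the projected profile falls as $(1 + (R/R_s)^2)^{-(\alpha-1)/2}$, which reduces to the standard projected Plummer law for $\alpha = 5$, and a spatially constant Milky Way density. The function name `member_fraction` and the example numbers are hypothetical.

```python
import numpy as np

def member_fraction(R, N_gal, R_s, alpha=5.0):
    """f(R) = n_gal / (n_gal + n_MW) for a modified Plummer profile over a
    uniform Milky Way foreground, with N_gal = n_gal(0) / n_MW,0."""
    R = np.asarray(R, float)
    # Profile shape normalized to 1 at R = 0 (assumed exponent -(alpha - 1)/2).
    shape = (1.0 + (R / R_s) ** 2) ** (-(alpha - 1.0) / 2.0)
    # Dividing numerator and denominator of Equation (5) by n_MW,0.
    return N_gal * shape / (N_gal * shape + 1.0)
```

For example, `member_fraction(0.0, 10.0, 30.0)` evaluates the fraction at the center for a central density ten times the foreground density; the fraction then declines monotonically with radius as the dwarf profile drops below the foreground.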

Neglecting binaries, the velocity distribution of Segue 1 is
assumed to be Gaussian with dispersion $\sigma$ and mean velocity
$\mu$. Although in principle any velocity distribution can be
used, there is currently no evidence for large deviations from
Gaussianity in dSph velocity distributions (e.g., Walker et al.
2006). In Section 3 we discuss how this velocity distribution
is modified by the presence of binary stars. For the velocity
likelihood of Milky Way stars, we use the Besançon
model (Robin et al. 2003) together with the appropriate color-magnitude
cuts. However, to allow for uncertainties in the Besançon
model, we allow the velocity distribution to be shifted
by a small amount $\delta$ and stretched by a factor $S$; both are
shown to be well determined by the data. We also explored
the potential effects of assuming other foreground models (a
"noisy" Besançon model and a Gaussian fit whose peak is offset
by about 50 km s$^{-1}$) but found no significant effect on the
inferred posterior for the intrinsic dispersion. We therefore do
not discuss these alternate foreground models further.
Our resulting set of 14 model parameters is

$$\mathscr{M} = \{N_{\rm gal}, \sigma, \mu, \bar{w}, \sigma_w, \bar{w}_{\rm MW}, \sigma_{w,{\rm MW}}, R_s, \delta, S, \alpha\}. \tag{8}$$

The probability density of the model parameters $\mathscr{M}$ given
the data sets $\mathscr{W} = \{w_i\}$, $\mathscr{V} = \{v_i\}$, and $\mathscr{R} = \{R_i\}$ can now be
written as

$$\mathcal{P}(\mathscr{M}|\mathscr{W}, \mathscr{V}, \mathscr{R}) \propto \mathcal{L}(\mathscr{W}, \mathscr{V}|\mathscr{R}, \mathscr{M})\,\mathcal{P}(\mathscr{M}), \tag{9}$$

where $\mathcal{L}(\mathscr{W}, \mathscr{V}|\mathscr{R}, \mathscr{M}) = \prod_i \mathcal{L}(w_i, v_i|R_i, \mathscr{M})$ is the likelihood
function for the complete data set and $\mathcal{P}(\mathscr{M})$ is the prior on
the model parameters. We choose uniform priors in the above
parameters with the exception of the metallicity distribution
widths, $\sigma_w$ and $\sigma_{w,{\rm MW}}$, for which we choose the usual noninformative
priors that are uniform in log-space. To conservatively
bias our member probabilities (and consequently the
dispersion) low, we choose the $N_{\rm gal}$ and $R_s$ priors also to be
uniform in log-space; however, we found the form of the priors
in these parameters to have little effect on the inferred dispersion.
The prior on velocity dispersion was chosen to be
uniform since this is the parameter of interest.

After estimating the model parameters $\mathscr{M}$, we can derive
membership probabilities for each individual star. The formula
for the probability of membership for the $i$th star is

$$p_i = \frac{f(R_i)\,\mathcal{L}_{\rm gal}(w_i, v_i|R_i)}{f(R_i)\,\mathcal{L}_{\rm gal}(w_i, v_i|R_i) + (1 - f(R_i))\,\mathcal{L}_{\rm MW}(w_i, v_i|R_i)}. \tag{10}$$

Because we derive a probability distribution in the model parameters
$\mathscr{M}$, the probability distribution for $p_i$ can be obtained
using our method. Here, we will quote the average
membership probability $\langle p_i \rangle$.
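A minimal sketch of the membership probability for a single star at one set of model parameters follows. It makes two simplifications that are assumptions of this example rather than the method described above: the Milky Way velocity term is a single Gaussian stand-in instead of the Besançon model, and binaries are ignored. The dictionary layout for `gal` and `mw` and all numerical values are hypothetical.

```python
import numpy as np

def gauss(x, mean, var):
    """Normal density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def membership_probability(v, e, w, e_w, f_R, gal, mw):
    """Membership probability of one star given its velocity v (error e),
    reduced Ca EW w (error e_w), and local member fraction f_R = f(R)."""
    # Intrinsic widths and measurement errors add in quadrature.
    L_gal = (gauss(v, gal["mu"], gal["sigma"] ** 2 + e**2)
             * gauss(w, gal["w_mean"], gal["w_sigma"] ** 2 + e_w**2))
    L_mw = (gauss(v, mw["mu"], mw["sigma"] ** 2 + e**2)
            * gauss(w, mw["w_mean"], mw["w_sigma"] ** 2 + e_w**2))
    num = f_R * L_gal
    return num / (num + (1.0 - f_R) * L_mw)

# Illustrative parameter values only.
gal = {"mu": 209.0, "sigma": 3.7, "w_mean": 3.1, "w_sigma": 0.05}
mw = {"mu": 60.0, "sigma": 60.0, "w_mean": 4.0, "w_sigma": 0.5}
p_member = membership_probability(209.0, 3.0, 3.1, 0.4, 0.5, gal, mw)
p_foreground = membership_probability(60.0, 3.0, 4.0, 0.4, 0.5, gal, mw)
```

The averaged probability $\langle p_i \rangle$ would then be the mean of this quantity over posterior samples of the model parameters, for instance with `np.mean` over one call per sample.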

3. BAYESIAN METHOD: CORRECTING FOR BINARIES 

Apart from contamination by nonmember stars, the ob- 
served velocity dispersion of Segue 1 may also be inflated by 
binary orbital motion. One method of correcting the dispersion
for binary motion is given in Minor et al. (2010). This
method requires measuring the threshold fraction of the sam- 
ple, defined as the fraction of stars with observed change in 
velocity greater than a certain threshold after a time inter- 
val (typically 1 year). Provided that velocity outlier stars 
are discarded when determining the dispersion (e.g., by a 3$\sigma$
clip), the threshold fraction $F$ is tightly correlated with the
dispersion introduced by binaries. This relation can be used 
to correct the dispersion for binaries. Although the thresh- 
old fraction is defined in terms of two epochs, it can be better 
determined using more than two epochs with a likelihood ap- 
proach. This approach also has the advantage that it uses only 
velocity changes to characterize the binary population, and 
hence is less affected by contamination by nonmember stars 
than if the velocities were used directly. 
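For two epochs, the threshold fraction as defined above is a simple count. The sketch below is only the raw two-epoch statistic: it ignores measurement errors, which the likelihood-based estimate mentioned in the text is meant to handle, and the function name `threshold_fraction` is hypothetical.

```python
import numpy as np

def threshold_fraction(v1, v2, threshold):
    """Fraction of stars whose line-of-sight velocity changed by more than
    `threshold` between two epochs (all quantities in km/s)."""
    dv = np.abs(np.asarray(v2, float) - np.asarray(v1, float))
    return float(np.mean(dv > threshold))
```

With velocity changes of 1, 6, -7, and 2 km s$^{-1}$ and a 5 km s$^{-1}$ threshold, two of the four stars exceed the threshold.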

Unfortunately, this method is not ideal for the present 
data set of ultrafaint galaxies like Segue 1 for several rea- 
sons. First, the majority of the sample consists of faint main- 
sequence stars (and not red giants) for which the measure- 
ment errors are considerable (of the same order as the dis- 
persion itself). Given this and the present sample size for 
Segue 1 (65 stars with multi-epoch measurements, roughly 
half of which are members), the threshold fraction is not well 
determined. Second, the relation between threshold fraction 
and dispersion is a result of the degeneracy of binary frac- 
tion with other properties characterizing the binary population 
(e.g., mean period). However, this degeneracy is weaker for 
main-sequence stars than for red giants, so that the uncertainty
in the binary correction of Minor et al. (2010) becomes wider
by a factor of two, though it is mainly at the small dispersion
end. Third, this method only corrects the dispersion by
an amount that is the same for each star, whereas individual
stars with large observed velocity changes should in principle
receive a larger correction.

While the Minor et al. (2010) method can still be applied,
we adopt a more ambitious approach: modeling the multi-epoch
likelihood of binary stars and incorporating it into a
comprehensive Bayesian analysis. In this approach, we include
as model parameters the binary fraction $B$, mean period
$\mu_{\log P}$, and width of the period distribution $\sigma_{\log P}$. Since the
individual velocities are used, in order to distinguish between 
binaries and nonmember stars we will also need to model the 
likelihood of nonmember stars as in the previous section. In 
principle this is the best possible method for determining the 



Velocity Distribution Derived Value Derived Value 

Parameters Priors Assumed^ Conditional Likelihood' Full Likelihood^ Description 

o" km s"' < cr < 10 km s"' 3.7^[ | km s"' 3.5^[ q km s"' Intrinsic velocity dispersion of Segue 1 

H 200 km s"' < ;u < 220 km s"' 209+[ km s"' 209^J km s"' Systemic velocity of Segue 1 

vvgai 2A<vv<6A 3.1+°-^ A 3.2+°-^ A Segue 1 average reduced Ca EW (Equation O)) 

cr„,_g,i -2 < logio(cr„[A]) < 1 0-05 !oo^ 0.03!°°^ Segue 1 reduced Ca EW dispersion (Equation ©) 

vvmw 2A<vv<6A 4.0+°- [a 4.0+°[ A MW average reduced Ca EW width (Equation Q) 

cr„.Mw -2 < logio(cr„[A]) < 1 0-06!oo4 0.05 reduced Ca EW dispersion (Equation (O) 

6 -70 km s"' < (5 < 10 km s"' -19!g km s"' -20+g km s"' Shift in the MW velocity distribution 

S -2 < logio(5 ) < 1 0-03!oo5 0-01-005 Scale in the MW velocity distribution 



Stellar Profile Derived Value Derived Value 

Parameters Priors Assumed' Conditional Likelihood' Full Likelihood^ Description 

1 < logio(/?j[pc]) < 2 1.8+[J-^ 1.4+^-2 Scale radius (Equation ^) 

a 3<a<10 * Outer log slope (Equation I©) 

A^gai -1 < logio(A^gai) < 3 0-5!o2 1-0-02 S&gVi& 1 Central density / MW density (Equation d?)) 



Derived Value Derived Value 

Binary Parameters Priors Assumed' Conditional Likelihood' Full Likelihood^ Description 

B Q < B < \ * * Binary fraction 

criogi^(P) 0.5 < (T\o^^^(p) < 2.3 * * Dispersion of the orbital period distribution 

/iiogiij(/)) MW composite prior (see the text) ^■'^-{'2 ^-^-l o Mean of the orbital period distribution 

* Value not constrained. 
Unless otherwise stated, the prior is assumed to be flat within the given range. 
"Using the conditional likelihood £.("f^, given by Equation @ 

''Using the likelihood £.{ '/^, S) given by Equation (2) where we include the spatial information directly. 
Note that the results from 1 and 2 are quantitatively similar. The differences arise due to the fact that the half-light radius (which is determined by both Sj and a) is better constrained by using the full likelihood. 
Except for the constraints on the half-Ught radius and the dark matter density within the half-light radius, our final results are based on the (more conservative) conditional likelihood. 
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intrinsic dispersion of a dwarf galaxy or cluster, since it uses 
all the available information to constrain properties of the bi- 
nary, member, and nonmember populations in a consistent 
way. 



measurement error, Q 



(14) 



3.1. Multi-epoch likelihood 

In order to correct the velocity dispersion of dwarf 
spheroidal galaxies for binaries, we must extend the Bayesian 
method developed in Section 2 to include the effect of binary 
stars. First we neglect the Milky Way component and focus 
on the dwarf galaxy likelihood, for which the dynamical parameters
are the velocity dispersion ($\sigma$) and systemic velocity
($\mu$). We take as a model parameter the fraction ($B$) of the stars
in binary systems, and we further model the binary population
by a set of parameters $\mathscr{P}$ that characterize the distributions of
binary properties. In general, these binary properties may include
the periods, mass ratios, and orbital eccentricities. The
distributions of these properties and our choice of model parameters
will be discussed in detail in Section 3.2.

Suppose a star of absolute magnitude $M$ has a set of $n$ velocity
measurements $\{v_i\} = \{v_1, \ldots, v_n\}$ and errors $\{e_i\}$ taken
at the corresponding dates $\{t_i\}$. For readability, when denoting
probability distributions we will suppress the brackets denoting
sets of measurements (e.g., $P(\{v_i\}) \to P(v_i)$). For reasons
that will become clear later, we will write the likelihood
of each star in terms of a joint probability distribution in the
measured velocities $v_i$ and $v_{\rm cm}$, the velocity of the star system's
center of mass (which is unknown), and then integrate
over $v_{\rm cm}$. The likelihood can be written as

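$$\begin{aligned}
\mathcal{L}(v_i|e_i, t_i, M; \sigma, \mu, B, \mathscr{P}) &= \int_{-\infty}^{\infty} P(v_i, v_{\rm cm}|e_i, t_i, M; \sigma, \mu, B, \mathscr{P})\,dv_{\rm cm} \\
&= \int_{-\infty}^{\infty} P(v_i|v_{\rm cm}, e_i, t_i, M; B, \mathscr{P})\,P(v_{\rm cm}|\sigma, \mu)\,dv_{\rm cm}.
\end{aligned} \tag{11}$$

The second factor in the integrand is the probability distribution
of the center-of-mass velocity of the stars, which we take
to be Gaussian:

$$P(v_{\rm cm}|\sigma, \mu) = \frac{e^{-(v_{\rm cm} - \mu)^2/2\sigma^2}}{\sqrt{2\pi\sigma^2}}. \tag{12}$$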



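The first factor in the integrand of Equation (11) is the probability
of drawing a set of velocity measurements $\{v_i\}$ given
a star with center-of-mass velocity $v_{\rm cm}$. This probability distribution
is determined by two factors, binarity and measurement
error. It can be written as follows:

$$\begin{aligned}
P(v_i|v_{\rm cm}, e_i, t_i, M; B, \mathscr{P}) &= (1 - B)\prod_{i=1}^{n} \frac{e^{-(v_i - v_{\rm cm})^2/2e_i^2}}{\sqrt{2\pi e_i^2}} + B\,P_b(v_i|v_{\rm cm}, e_i, t_i, M; \mathscr{P}) \\
&= (1 - B)\,N(v_i, e_i)\,\frac{e^{-(\langle v \rangle - v_{\rm cm})^2/2e_m^2}}{\sqrt{2\pi e_m^2}} + B\,P'_b(v_i - v_{\rm cm}|e_i, t_i, M; \mathscr{P}),
\end{aligned} \tag{13}$$

where $P'_b(v_i - v_{\rm cm}|e_i, t_i, M; \mathscr{P})$ is the likelihood in the center-of-mass
frame of the binary system, with the velocity in the
center-of-mass frame given by $v'_i = v_i - v_{\rm cm}$. In the first term,
$\langle v \rangle$ and $e_m$ are the weighted average velocity and equivalent
measurement error,

$$\langle v \rangle = \frac{\sum_{i=1}^{n} v_i/e_i^2}{\sum_{i=1}^{n} 1/e_i^2}, \tag{14}$$

$$e_m^2 = \left(\sum_{i=1}^{n} \frac{1}{e_i^2}\right)^{-1}, \tag{15}$$
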

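while the normalizing factor $N$ is given by

$$N(v_i, e_i) = \frac{\sqrt{2\pi e_m^2}}{\prod_{i=1}^{n} \sqrt{2\pi e_i^2}} \exp\left[-\frac{1}{4} \sum_{i,j} \frac{(v_i - v_j)^2}{e_i^2 + e_j^2 + e_i^2 e_j^2 \sum_{k \neq i,j} e_k^{-2}}\right]. \tag{16}$$

The last term in the denominator of the exponent is implicitly
zero when $n = 2$.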
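The factorization behind Equations (13) through (16) can be checked numerically. The sketch below assumes the inverse-variance forms of $\langle v \rangle$ and $e_m$ and the double sum over all ordered pairs $(i, j)$ in the exponent of $N$; the function names and test velocities are illustrative. It confirms that the product of per-epoch Gaussians equals $N$ times a single Gaussian in $\langle v \rangle - v_{\rm cm}$ with width $e_m$.

```python
import numpy as np

def weighted_mean_and_error(v, e):
    """Inverse-variance weighted mean velocity and equivalent error e_m."""
    v, e = np.asarray(v, float), np.asarray(e, float)
    w = 1.0 / e**2
    return float(np.sum(w * v) / np.sum(w)), float(np.sqrt(1.0 / np.sum(w)))

def n_factor(v, e):
    """Normalizing factor N(v_i, e_i), written with the pairwise-difference sum."""
    v, e = np.asarray(v, float), np.asarray(e, float)
    e2 = e**2
    inv_sum = np.sum(1.0 / e2)                       # equals 1 / e_m^2
    ei2, ej2 = e2[:, None], e2[None, :]
    rest = inv_sum - 1.0 / ei2 - 1.0 / ej2           # sum over k != i, j
    denom = ei2 + ej2 + ei2 * ej2 * rest
    dv2 = (v[:, None] - v[None, :]) ** 2             # zero on the diagonal
    prefactor = np.sqrt(2.0 * np.pi / inv_sum) / np.prod(np.sqrt(2.0 * np.pi * e2))
    return prefactor * np.exp(-0.25 * np.sum(dv2 / denom))

def direct_product(v, e, v_cm):
    """Product of independent per-epoch Gaussian likelihoods."""
    v, e = np.asarray(v, float), np.asarray(e, float)
    return np.prod(np.exp(-0.5 * (v - v_cm) ** 2 / e**2) / np.sqrt(2.0 * np.pi * e**2))

def factored_form(v, e, v_cm):
    """N times a single Gaussian in (<v> - v_cm) with width e_m."""
    v_avg, e_m = weighted_mean_and_error(v, e)
    g = np.exp(-0.5 * (v_avg - v_cm) ** 2 / e_m**2) / np.sqrt(2.0 * np.pi * e_m**2)
    return n_factor(v, e) * g
```

Because $N$ does not depend on $v_{\rm cm}$, `direct_product` and `factored_form` agree for any trial center-of-mass velocity. Large epoch-to-epoch differences relative to the errors drive the exponent strongly negative, which is why $N$ becomes small for velocity variables.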

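Multiplying Equation (13) by Equation (12) and integrating
in accordance with Equation (11), we find:

$$\mathcal{L}(v_i|e_i, t_i, M; \sigma, \mu, B, \mathscr{P}) \propto (1 - B)\,\frac{e^{-(\langle v \rangle - \mu)^2/2(\sigma^2 + e_m^2)}}{\sqrt{2\pi(\sigma^2 + e_m^2)}} + B\,J(\sigma, \mu, \mathscr{P}), \tag{17}$$

where we have left off the normalizing $N$ factor, and

$$J(\sigma, \mu, \mathscr{P}) = \int_{-\infty}^{\infty} \mathcal{R}(v_{\rm cm})\,\frac{e^{-(v_{\rm cm} - \mu)^2/2\sigma^2}}{\sqrt{2\pi\sigma^2}}\,dv_{\rm cm}, \tag{18}$$

$$\mathcal{R}(v_{\rm cm}) = \frac{P'_b(v_i - v_{\rm cm}|e_i, t_i, M; \mathscr{P})}{N(v_i, e_i)}. \tag{19}$$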



Since the factor $N$ is independent of all model parameters, it
is usually ignored in the likelihood when averaging velocities
without regard for binaries (i.e., it acts only as a normalizing
factor). As Equation (19) shows, however, it is crucial to
include here since it determines the relative normalization of
the binary and non-binary terms. Note that if a star exhibits
large velocity variations compared to the measurement errors,
according to Equation (16) the $N$ factor will be quite small.
If in addition the velocity variations are observed over some
time interval consistent with binary behavior, the normalization
of $\mathcal{R}(v_{\rm cm})$ will be greatly enhanced, possibly by orders
of magnitude, because of the $N$ factor in the denominator of
Equation (19).

For each star that has multi-epoch data, we run a Monte
Carlo simulation and bin the velocities over a table of $v_{\rm cm}$ values
to find $P'_b(v_i - v_{\rm cm}|e_i, t_i, M)$. The $\mathcal{R}$-function is recorded for
each star and subsequently integrated to evaluate $J(\sigma, \mu, \mathscr{P})$.
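The Monte Carlo step and the integral for $J$ can be sketched as follows. This is not the simulation used in this work: it assumes circular orbits, a uniform mass ratio, isotropic orientations, and a log-normal period distribution, and it replaces histogram binning with an average of Gaussian error kernels over the simulated binaries. All function names, the primary mass, and the grid are hypothetical.

```python
import numpy as np

def gauss(x, var):
    """Zero-mean normal density with variance var."""
    return np.exp(-0.5 * x**2 / var) / np.sqrt(2.0 * np.pi * var)

def simulate_offsets(t_yr, n_sim, mu_logP, sigma_logP, m1=0.8, rng=None):
    """Simulated line-of-sight velocity offsets (km/s) of the primary at the
    epochs t_yr (years), one row per simulated binary. Assumes circular
    orbits and log10(P / yr) ~ Normal(mu_logP, sigma_logP)."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.asarray(t_yr, float)
    P = 10.0 ** rng.normal(mu_logP, sigma_logP, n_sim)        # periods in years
    m2 = m1 * rng.uniform(0.1, 1.0, n_sim)                    # companion mass (Msun)
    sin_i = np.sqrt(1.0 - rng.uniform(0.0, 1.0, n_sim) ** 2)  # cos(i) uniform
    phase = rng.uniform(0.0, 2.0 * np.pi, n_sim)
    # Semi-amplitude of the primary; 29.78 km/s is the Earth's orbital speed.
    K = 29.78 * m2 * sin_i / (m1 + m2) ** (2.0 / 3.0) / P ** (1.0 / 3.0)
    return K[:, None] * np.cos(2.0 * np.pi * t[None, :] / P[:, None] + phase[:, None])

def star_likelihood(v, e, dv, sigma, mu, B, v_cm_grid):
    """Likelihood of one star in the form of Equation (17), up to the factor N.
    dv has shape (n_sim, n_epochs) and holds simulated binary offsets."""
    v, e = np.asarray(v, float), np.asarray(e, float)
    w = 1.0 / e**2
    e_m2 = 1.0 / w.sum()
    v_avg = (w * v).sum() * e_m2
    # N from prod_i G(v_i - c; e_i^2) = N * G(<v> - c; e_m^2), evaluated at c = <v>.
    N = np.prod(gauss(v - v_avg, e**2)) / gauss(0.0, e_m2)
    # P'_b on the v_cm grid: mean over simulated binaries of the per-epoch kernels.
    resid = v[None, None, :] - v_cm_grid[:, None, None] - dv[None, :, :]
    P_b = gauss(resid, e[None, None, :] ** 2).prod(axis=2).mean(axis=1)
    integrand = (P_b / N) * gauss(v_cm_grid - mu, sigma**2)
    # Trapezoidal rule for J(sigma, mu).
    J = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(v_cm_grid))
    return (1.0 - B) * gauss(v_avg - mu, sigma**2 + e_m2) + B * J

v_obs, e_obs = [207.0, 211.0, 209.5], [2.0, 3.0, 4.0]
grid = np.linspace(150.0, 270.0, 2401)
dv = simulate_offsets([0.0, 1.0, 2.0], 200, 1.0, 1.0, rng=np.random.default_rng(0))
L_star = star_likelihood(v_obs, e_obs, dv, 4.0, 209.0, 0.5, grid)
```

A useful consistency check on this construction: if every simulated binary has zero velocity amplitude, the binary term collapses onto the non-binary term and the result no longer depends on $B$.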

Footnote: The data $e_i$ in Equations (14) and (15) include all the sources of error
identified in Simon & Geha (2007). These errors could have a systematic
component (as suggested by Simon & Geha 2007) that does not average out
statistically as we have assumed here. A greater number of repeat independent
velocity measurements would be required to test for this scenario. For
consistency, the calcium triplet EWs are averaged in exactly the same manner
as the velocity measurements.
3.2. Binary population model uncertainties

To infer the intrinsic velocity dispersion of Segue 1, we 
must marginalize over the parameters characterizing the bi- 
nary population. It is therefore critical to address the question 
of which binary model parameters to use and how to deal with 
uncertainties in these parameters. Besides the binary fraction, 
a population of binary stars can be described by distributions 
in three parameters: the mass ratio q, eccentricity e, and orbital
period P. In the absence of a large number of epochs,
eccentricities are difficult to constrain because very eccentric
binaries spend a relatively small amount of time near
periastron, where the observed velocities are large. We therefore
fix the distribution of eccentricities and assume the form
given in Minor et al. (2010), which is similar to that observed 
in solar neighborhood field binaries. 

Along similar lines, velocity measurements at several
epochs are usually needed to determine the mass ratio of a
binary independently of its orbital period. Evidence suggests,
however, that the period distributions of different binary
populations can differ drastically, while the distribution
of mass ratios may have a more nearly universal form.
This is certainly true for long-period binaries, for which the
mass ratio follows the Salpeter initial mass function for $q \gtrsim 0.5$
(assuming the primary mass to lie in a very restricted
range, as is the case for the observed sample in Segue 1;
cf. Duquennoy & Mayor 1991). The observed distribution
of mass ratios for short-period binaries (P < 1000 days) is
closer to uniform (Goldberg et al. 2003; Mazeh et al. 1992),
and at present it is unclear whether this form is universal in
primordial binary populations. We therefore fix the mass ratio
distribution and assume it to follow a form similar to that
observed in the solar neighborhood, as described in Minor et al.
(2010), with a uniform distribution for short-period binaries.
Note that we are allowing for the mass ratio and eccentricity to
vary from star to star; it is just the form of the distribution
from which these parameters are drawn that is fixed. In principle,
we could also vary the functional form of the q and e distributions, but
this is computationally expensive. The main reason is that the
function $\mathcal{R}(v_{\rm cm}; \mathcal{P})$ (see Equation (19)) would have to be
computed on a grid that includes the parameters used to describe
the functional form of the q and e distributions. In addition, given
the small data set and the large measurement errors, these
parameters would be highly degenerate with the other binary parameters
($B$, $\mu_{\log P}$, $\sigma_{\log P}$).

Although binary populations in open clusters have been
observed to display a narrower distribution of periods than
binaries in the field (Brandner & Koehler 1998; Scally et al.
1999), they still range over multiple decades of period.
For simplicity we assume the period distribution of Segue
1 to have a log-normal form, in analogy to field binaries
(Duquennoy & Mayor 1991; Fischer & Marcy 1992;
Mayor et al. 1992; Raghavan et al. 2010), while the mean period
$\mu_{\log P}$ and spread of periods $\sigma_{\log P}$ will be allowed to differ
from those observed in solar neighborhood field binaries.
We therefore have three binary parameters that are allowed to
vary: the binary fraction $B$, the mean log-period $\mu_{\log P}$, and the
log-spread of periods $\sigma_{\log P}$.

Since the binary fraction $B$ may have any value between 0
and 1, we choose a uniform prior in $B$ over this interval. Our
prior in the period distribution parameters $\mu_{\log P}$ and $\sigma_{\log P}$,
however, requires more careful consideration. The prevailing
viewpoint is that the observed distribution of field binary stars 
in the solar neighborhood is a superposition of populations 
Fig. 2. Period distributions of simulated clusters generated from our priors,
compared to the observed period distribution of field binaries in the solar
neighborhood with solar-mass primaries (Duquennoy & Mayor 1991). The
simulated distributions have widths drawn from a flat prior with a range
$\sigma_{\log P} \in [0.5, 2.3]$, and mean periods drawn from a Gaussian prior chosen
such that when many cluster period distributions are superimposed, they form
the Duquennoy & Mayor (1991) period distribution of field binaries.

from a wide variety of star-forming environments with different
period distributions; this is supported by the fact that several
clusters seem to exhibit period distributions that are more
peaked than those observed in the field (Brandner & Koehler
1998; Scally et al. 1999). We shall therefore make the assumption
that the binary populations in dwarf spheroidal
galaxies follow period distributions that are a subset of the
distribution exhibited by solar neighborhood field binaries.
Given the fact that the period distribution in the solar neighborhood
is nearly flat in log-space over the relevant parameter
space, one option is to use flat priors in $\mu_{\log P}$ and $\sigma_{\log P}$. However,
the limits of integration are somewhat arbitrary, and may
allow populations with binary periods shorter or longer than
any observed in the solar neighborhood. A somewhat better-motivated
method is to assume a flat prior in $\sigma_{\log P}$ with a
certain range $[\sigma_{\log P,\min}, 2.3]$, and then find a prior distribution
in the mean period such that when many binary populations
are drawn from these priors, they superimpose to form the
Duquennoy & Mayor (1991) period distribution observed in
field binaries. This is illustrated in Figure 2, where we plot
a few cluster period distributions which have been generated
from this prior, together with the field binary period distribution
of Duquennoy & Mayor (1991) observed in the solar
neighborhood, which has parameters $\mu_{\log P} = 2.23$ (P in years)
and $\sigma_{\log P} = 2.3$. We assume a Gaussian prior for $\mu_{\log P}$ with
a mean of 2.23, then maximize a likelihood to find the
width $\sigma_\mu$ of this prior required to reproduce the field binary
distribution when a large number of populations are superimposed.
If we choose a minimum period spread $\sigma_{\log P,\min} = 0.5$,
we find that the width of the mean period prior satisfying these
conditions is $\sigma_\mu = 1.7$. While this prior already encapsulates
a very wide range of period distributions, we also investigate
more extreme priors and show that our inferred velocity
distribution is not significantly affected by our priors.
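
The quoted prior width can be checked roughly at the level of second moments. The sketch below is not the likelihood maximization described above; it only draws many log-normal populations from the stated priors and compares the spread of the superposition with the field value of 2.3. The sample sizes and variable names are assumptions for illustration. The comparison relies on the fact that the variance of a mixture is the variance of the component means plus the mean of the component variances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Prior settings taken from the text; sample sizes are arbitrary choices.
sigma_min, sigma_max = 0.5, 2.3   # flat prior range for the width of each population
mu_mean, sigma_mu = 2.23, 1.7     # Gaussian prior on the mean log-period (P in years)
n_pop, n_per_pop = 20000, 50

widths = rng.uniform(sigma_min, sigma_max, n_pop)
means = rng.normal(mu_mean, sigma_mu, n_pop)

# Draw log-periods for every population and pool them into one superposition.
log_p = rng.normal(means[:, None], widths[:, None], size=(n_pop, n_per_pop))
mixture_std = float(log_p.std())

# Variance of the mixture = variance of the means + mean of the variances.
mean_of_variances = (sigma_min**2 + sigma_min * sigma_max + sigma_max**2) / 3.0
expected_std = float(np.sqrt(sigma_mu**2 + mean_of_variances))

print(mixture_std, expected_std)
```

Both numbers come out close to 2.3, so the stated prior is at least consistent with the field distribution in its overall width. The superposition is not exactly Gaussian, which is presumably why a likelihood fit rather than a moment match was used.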

3.3. Test of the multi-epoch binary-correction method 

Fig. 3. Inferred probability distributions of the intrinsic velocity dispersion (panels a and c) and mean binary period (panels b and d) for simulated Segue 1-like galaxies, using
our method of modeling the binary population (solid) compared to simply clipping 3$\sigma$ velocity outliers (dashed) and then computing the dispersion. Each simulated galaxy uses the
same number of epochs, dates, velocity errors, and magnitudes as the actual Segue 1 sample. This includes foreground stars from the Besançon model for the Milky Way as outlined
in Section 3.3. We plot one of the realizations that has a maximum likelihood velocity dispersion close to 4 km s$^{-1}$ after discarding 3$\sigma$ outliers iteratively. For the actual intrinsic
dispersion, we choose two cases: 0.4 km s$^{-1}$ (top panels, a and b) and 3.7 km s$^{-1}$ (bottom panels, c and d), which is our inferred most probable dispersion of the actual Segue 1 data
set. The binary population has a mean period of 10 years, binary fraction of 0.7, and period distribution width $\sigma_{\log P} = 1.5$, consistent with our final results for the period distribution
and binary fraction of Segue 1 stars. To infer the binary-corrected dispersion, we marginalize over the systemic velocity, binary fraction, mean period, foreground parameters ($\delta$ and
$S$), and total fraction, whereas for the non-binary-corrected dispersion we marginalize only over the systemic velocity in addition to iteratively discarding 3$\sigma$ velocity outlier stars. It
is clear that our method is able to correctly infer the intrinsic dispersion and extract the mean period of the binaries (indicated by vertical dotted lines in each panel). As a check on
the robustness of our methodology, we simulated a mock data set with orbital periods drawn from a distribution flat in logarithm of period (cyan dot-dashed lines). Employing the
same analysis method and assumptions used previously, the intrinsic dispersion was recovered faithfully for both the 0.4 km s$^{-1}$ and 3.7 km s$^{-1}$ cases. Other realizations show similar
behavior; see Section 3.3 for more details.

Fig. 4. Fraction of stars with velocity changes (between
any two epochs) greater than a certain threshold, defined in units of
$\sigma_{2e}$, where $\sigma_{2e}^2 = \sigma_1^2 + \sigma_2^2$ and $\sigma_1$ and $\sigma_2$ are the errors of the first and second epoch
measurements. The solid magenta curve is the Gaussian expectation, i.e., no
contribution from binary orbital motion. Simulations with no binaries (not
shown here to retain simplicity of presentation) straddle the Gaussian expectation
and do not show systematic positive deviations as large as the data.
This is therefore a simple way to deduce the presence of binaries and it is
the reason why we are able to get a handle on some of the binary properties.
We have also plotted the simulations shown in Figure 3 as the dotted red and
green curves, which show similar behavior to the data; that is the reason
why we are able to constrain their (statistical) binary properties and recover
the intrinsic dispersion.

Before writing down our final complete likelihoods for
dSph and Milky Way stars, we provide a test of the effectiveness
of our binary correction method in the presence of foreground
stars, which is summarized in Figure 3. We generate
a series of mock observations of three Segue 1-like galaxies.
In one set of realizations (illustrated in the upper two panels)
we assume that the underlying velocity dispersion is $\sigma = 0.4$
km s$^{-1}$. Note that because of the extremely low luminosity
of Segue 1 ($L \approx 400\,L_\odot$) even a velocity dispersion as low as
0.4 km s$^{-1}$ would imply dark matter, with $M_{1/2}/L_{1/2} \sim 18$,
using the formula of Wolf et al. (2010). In the other two
realizations (lower panels), we assume that the intrinsic velocity
dispersion is close to what we infer in the next section from
the actual data set, $\sigma = 3.7$ km s$^{-1}$. We assume that the galaxies
have binary star populations of the type that could conceivably
hinder our ability to infer an intrinsic velocity dispersion,
with $B = 0.7$, a mean period of $P = 10$ yr, a period spread of
$\sigma_{\log P} = 1.5$, and eccentricity and mass ratio distributions as
described in the earlier section. We also include a distribution of
foreground stars drawn from a Besançon model displaced 209
km s$^{-1}$ in mean velocity (see Figure 1) from the mock Segue 1
galaxy. In the simulations with $\sigma = 3.7$ km s$^{-1}$, we also
consider a case with a 180 year mean period, which is consistent
with the solar neighborhood value. In addition, we simulated
a set of mock observations with periods drawn from a
distribution flat in $\log P$ for both the 0.4 km s$^{-1}$ and 3.7 km s$^{-1}$
cases. We analyze this set in the same way as the others by fitting
it with a log-normal distribution in period. Each mock galaxy
consists of 69 member stars, which are "observed" once or
multiple times in exact correspondence with our real Segue
1 member sample (Paper I, Section 3.1). Each mock data
set also contains 109 nonmember foreground stars randomly
selected from the Besançon model. For each star, velocities
are generated using the measurement errors, dates, and
magnitudes from the measured stars in the Segue 1 sample. In
each panel, the dashed curve shows the result of inferring the
intrinsic dispersion based on the common procedure of
discarding 3$\sigma$ outliers iteratively from the member star sample
(labeled as 3$\sigma$ clipping). The solid lines in each panel show
the probability distributions of intrinsic dispersions using our
Bayesian analysis of multi-epoch data described in the previous
sections. The dot-dashed curves show the probability
distributions resulting from our full Bayesian analysis on a
mock data set simulated with a flat $\log P$ distribution.
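
For reference, the comparison procedure of iteratively discarding 3$\sigma$ outliers can be sketched as follows. This is a generic illustration, not the exact implementation behind the dashed curves: the function name, the convergence rule, and the toy velocities are assumptions, and unlike a maximum likelihood estimate the sample standard deviation used here does not subtract the measurement errors.

```python
import numpy as np

def sigma_clip_dispersion(velocities, n_sigma=3.0, max_iter=50):
    """Iteratively drop stars more than n_sigma standard deviations from the mean."""
    v = np.asarray(velocities, dtype=float)
    keep = np.ones(v.size, dtype=bool)
    for _ in range(max_iter):
        mean = v[keep].mean()
        std = v[keep].std(ddof=1)
        # Stars once rejected stay rejected, so the loop always terminates.
        new_keep = keep & (np.abs(v - mean) <= n_sigma * std)
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return v[keep].mean(), v[keep].std(ddof=1), keep

# Toy sample (hypothetical numbers): 60 stars around 208 km/s plus two outliers.
rng = np.random.default_rng(2)
toy_velocities = np.concatenate([rng.normal(208.0, 4.0, 60), [160.0, 250.0]])
mean_v, disp, keep = sigma_clip_dispersion(toy_velocities)
print(mean_v, disp, int(keep.sum()))
```

Clipping removes only stars whose velocities stand out individually; binaries with modest velocity offsets stay in the sample and inflate the dispersion, which is the limitation the multi-epoch analysis is meant to address.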

The top panel of Figure 3 illustrates that even when binary
orbital motion accounts for most of the observed dispersion
in the presence of significant foreground contamination, our
method is able to extract the intrinsic dispersion faithfully.
Moreover, we are able to recover the mean binary period,
including cases where the intrinsic dispersion is fairly high
(lower panel) and even when P = 180 yr. Although periods
longer than a few years are not directly observable in the
time-frame of our mock observations (1-2 years), our method
extrapolates from the period distribution at shorter periods under
the assumption of a log-normal period distribution. Therefore,
the mean period can still be inferred in this case, inasmuch
as the assumption of a log-normal period distribution
holds. More impressively, our method is able to recover the
correct intrinsic dispersions even when the underlying period
distribution is different from log-normal.

Though only typical simulation results are shown in Figure 3,
we have applied our method to several mock galaxies
within each category described above. In the case of a
0.4 km s$^{-1}$ intrinsic dispersion, our inferred dispersions are
consistently more accurate than the 3$\sigma$ clipping result, and
each distribution exhibits only a small probability of having
a dispersion greater than ~3 km s$^{-1}$. Furthermore, the most
probable inferred dispersion corresponding to the peaks in
Figure 3(a) is smaller than 2 km s$^{-1}$ for every simulation we
ran. These results show that binaries account for most of the
observed dispersion in this set of simulations, and that our
method is able to extract the intrinsic dispersion faithfully.
In the case of 3.7 km s$^{-1}$ intrinsic dispersion (Figure 3(c)),
while some of the posteriors do have tails going to zero
dispersion, the probability in the region between 0-1 km s$^{-1}$
dispersions is quite small, confirming that binaries are unlikely
to account for most of the 4 km s$^{-1}$ observed dispersion. In
all cases where the periods in the mock data were drawn from
a log-normal period distribution, we are also able to recover
the mean binary period to within about 1$\sigma$ (as the example in
Figures 3(b) and 3(d) illustrates).

It is worth noting that there is a small possibility of failure
in extracting the correct intrinsic dispersion and mean period.
This failure can arise in the following way. If the simulated
data set has a few outliers in velocity changes but the data
are otherwise roughly consistent with the measurement errors,
then the method will try to fit the outliers with a mean period
smaller than the correct one. However, the small number
of significant velocity changes will force the binary fraction
to be small. The shorter mean period will force the intrinsic
dispersion to be smaller (than the true dispersion), while the
smaller binary fraction will reduce the contribution of binary
orbital motion and hence increase the inferred intrinsic
dispersion. These effects go in opposite directions, but the net
effect could be to underestimate the binary correction. To test
for this possibility, we plot in Figure 4 the fraction of velocity
changes greater than a threshold and compare to the expectation
from just measurement errors. The data (shown in
dashed black) exceed the Gaussian (measurement error)
expectation and hence provide direct evidence of the presence of
binary stars. There is a systematic positive deviation that is
not consistent with changes introduced purely by measurement
errors. These deviations allow us to deduce the effect
of the binary orbital motion on the measured dispersion. The
fact that our data set does not show signatures of just a few
outliers (i.e., it shows a systematic positive deviation from the
Gaussian expectation) also assures us that it is not prone to
the failure mode described above.
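
The diagnostic of Figure 4 is simple enough to sketch directly. The example below is an illustration under stated assumptions rather than the code used for the figure: the function names and the symbol `sigma_2e` for the combined two-epoch error are choices made here, and the mock sample contains no binaries, so its exceedance fractions should follow the Gaussian expectation $\mathrm{erfc}(x/\sqrt{2})$.

```python
import math
import numpy as np

def exceedance_fraction(v1, e1, v2, e2, thresholds):
    """Fraction of stars with |v2 - v1| / sqrt(e1^2 + e2^2) above each threshold."""
    dv = np.abs(np.asarray(v2) - np.asarray(v1))
    sigma_2e = np.sqrt(np.asarray(e1) ** 2 + np.asarray(e2) ** 2)
    ratio = dv / sigma_2e
    return np.array([float((ratio > x).mean()) for x in thresholds])

def gaussian_expectation(thresholds):
    """Two-sided Gaussian tail probability, i.e. pure measurement error."""
    return np.array([math.erfc(x / math.sqrt(2.0)) for x in thresholds])

# Mock two-epoch sample with no binaries (all numbers are illustrative).
rng = np.random.default_rng(1)
n_stars = 20000
e1 = rng.uniform(1.0, 5.0, n_stars)
e2 = rng.uniform(1.0, 5.0, n_stars)
v_true = rng.normal(208.0, 3.7, n_stars)
v1 = v_true + e1 * rng.standard_normal(n_stars)
v2 = v_true + e2 * rng.standard_normal(n_stars)

thresholds = [0.5, 1.0, 1.5, 2.0]
observed = exceedance_fraction(v1, e1, v2, e2, thresholds)
expected = gaussian_expectation(thresholds)
print(observed, expected)
```

A sample containing binaries would show `observed` systematically above `expected` across thresholds, which is the signature described in the text, whereas a handful of outliers would raise only the largest-threshold bins.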

These simulation results provide ample reasons for confidence
in our technique. We will proceed to apply this technique
to the real Segue 1 data set in Section 4. Before doing
so, we add the final piece to our likelihood, which will allow 
for a self-consistent Bayesian treatment of both binary orbital 
motion and membership probabilities. 

3.4. Likelihood for dwarf spheroidal and Milky Way stars 

Suppose that over a certain region of the sky containing the
dwarf galaxy sample, a fraction F of the stars are members.
We can then write the likelihood in terms of F as

\begin{equation}
\mathcal{L}(v_i | e_i, t_i, M; F, B, \sigma, \mu) = (1 - F)\,\mathcal{L}_{\rm MW}(v_i | e_i) + F\,\mathcal{L}_{G}(v_i | e_i, t_i, M; B, \sigma, \mu), \tag{20}
\end{equation}



where $\mathcal{L}_G$ is the galaxy likelihood given by Equation (17) and
$\mathcal{L}_{\rm MW}$ is the Milky Way likelihood. Since the majority of
nonmember stars were not singled out for repeat measurements,
we do not directly model the binary population of the Milky
Way; instead, the effect of binaries and other uncertainties
in the Milky Way velocity distribution are accounted for by
the translation and stretch parameters $\delta$ and $S$ (as discussed
above in Equation (7)). For compactness, we have suppressed
the $\delta$ and $S$ dependence in Equation (20) and have also
suppressed the metallicity and position components of the
likelihood, which are, however, still included in our marginalization.



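The Milky Way likelihood $\mathcal{L}_{\rm MW}$ can be written as

\begin{equation}
\mathcal{L}_{\rm MW}(v_i | e_i) = \int_{-\infty}^{\infty} P_{\rm MW}(v_i | v_{\rm cm}, e_i)\, P_{\rm MW}(v_{\rm cm})\, dv_{\rm cm}, \tag{21}
\end{equation}

where $P_{\rm MW}(v_{\rm cm})$ is obtained from the Besançon model, and
the distribution $P_{\rm MW}(v_i | v_{\rm cm}, e_i)$ is equivalent to that of
Equation (13) with $B = 0$. Plugging in Equations (13) and (17), we
arrive at

\begin{equation}
\mathcal{L}(v_i | e_i, t_i, M; F, B, \sigma, \mu) \propto (1 - F)\,\mathcal{L}_{\rm MW}(\langle v \rangle | e_m) + F \left[ (1 - B)\, \frac{e^{-(\langle v \rangle - \mu)^2 / 2(\sigma^2 + e_m^2)}}{\sqrt{2\pi(\sigma^2 + e_m^2)}} + B\, J(\sigma, \mu) \right]. \tag{22}
\end{equation}

Again, we have absorbed the normalizing factor N into the
definition of $J(\sigma, \mu)$ (given in Equation (18)) and

\begin{equation}
\mathcal{L}_{\rm MW}(\langle v \rangle | e_m) = \int_{-\infty}^{\infty} \frac{e^{-(\langle v \rangle - v_{\rm cm})^2 / 2 e_m^2}}{\sqrt{2\pi e_m^2}}\, P_{\rm MW}(v_{\rm cm})\, dv_{\rm cm}. \tag{23}
\end{equation}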



Each term in Equation (22) gives the relative likelihood of
being a Milky Way, Segue 1 single, and Segue 1 binary star,
respectively. Again, we emphasize that our full likelihood also
includes metallicity and position information to help determine
membership of each star. To accomplish this, we multiply
each term in Equation (22) by the corresponding likelihoods
in metallicity and position (Equation (4)), the form of
which has already been discussed in Section 2. In addition
to the dispersion and systemic velocity of the dwarf galaxy,
there are 9 model parameters to help determine membership
(Section 2) and 3 binary parameters ($B$, $\mu_{\log P}$, $\sigma_{\log P}$), for a
total of 14 model parameters in the likelihood.
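
To make the three-term structure concrete, the sketch below evaluates the velocity part of such a mixture for a single star and converts the terms into relative weights. It is only an illustration: the function name and all input numbers are hypothetical, the Milky Way likelihood and the binary integral $J$ are passed in as precomputed values, and the metallicity and position factors that multiply each term in the full likelihood are omitted.

```python
import math

def mixture_terms(v_mean, e_m, F, B, sigma, mu, like_mw, j_value):
    """Velocity-only mixture: Milky Way, galaxy single star, galaxy binary star."""
    var = sigma**2 + e_m**2
    single = math.exp(-0.5 * (v_mean - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
    terms = {
        "milky_way": (1.0 - F) * like_mw,
        "segue1_single": F * (1.0 - B) * single,
        "segue1_binary": F * B * j_value,
    }
    total = sum(terms.values())
    weights = {name: value / total for name, value in terms.items()}
    return total, weights

# Hypothetical inputs: mean velocity and its error (km/s), model parameters,
# and precomputed values for the Milky Way likelihood and the J integral.
total, weights = mixture_terms(
    v_mean=207.0, e_m=2.0, F=0.4, B=0.5, sigma=3.7, mu=208.5,
    like_mw=0.002, j_value=0.05,
)
print(total, weights)
```

The normalized weights are the relative probabilities, at fixed model parameters, that the star belongs to each population; averaging such weights over the posterior is one way membership and binary probabilities for individual stars can be summarized.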

4. VELOCITY DISPERSION OF SEGUE 1 

Using the method outlined in the previous sections, we infer
a posterior probability distribution for the intrinsic velocity
dispersion of Segue 1 by marginalizing over all the
other parameters via a nested sampling routine (Skilling 2004;
Feroz et al. 2009). In the left panel of Figure 5, the inferred
probability distribution of the galaxy's velocity dispersion is
plotted with (solid) and without (triple-dot-dashed) correcting
for binaries. We see that correcting for binaries lowers
the inferred dispersion slightly and gives rise to a small but
non-zero probability of an intrinsic dispersion smaller than
1.5 km s$^{-1}$. Using our full sample, the binary-corrected velocity
dispersion is $3.7^{+1.4}_{-1.1}$ km s$^{-1}$ at 1$\sigma$. We find a ~3.5% probability
of a dispersion smaller than 1.5 km s$^{-1}$, and ~1.7% for
dispersions < 1 km s$^{-1}$. Although the low-dispersion tail in the
probability distribution is small, it does extend all the way to
zero velocity dispersion. As we will show in Section 5, this
is due mainly to the possibility of binary populations with a
high binary fraction and a mean period shorter than 10 years
(see Figure 8(a)).

To test how sensitive our results are to individual velocity
outlier stars, the right panel of Figure 5 plots the inferred
dispersion if the star SDSSJ100704.35+160459.4 is
excluded from the sample. This star is a 6$\sigma$ velocity outlier
that nevertheless has a substantial membership probability
($\langle p \rangle = 0.49$) due to its proximity to the projected center
of Segue 1. The inferred maximum likelihood dispersion using
the expectation maximization algorithm of Walker et al.
(2009b; which is not corrected for binaries) decreases from
5.5 km s$^{-1}$ to 3.9 km s$^{-1}$ when SDSSJ100704.35+160459.4
is removed from the sample (Paper I). We find that excluding
SDSSJ100704.35+160459.4 does not have a significant
effect on the general properties of the dispersion probability
distribution: the spread, the $\approx 4$ km s$^{-1}$ peak, and the low-velocity
tail features are largely unaffected. This is partly because its
membership is treated in a statistical sense, and also because
if the star is a member of Segue 1, the implied probability
of being a binary is quite high ($\langle p_b \rangle = 0.89$). On the other
hand, excluding the giants from the sample does bias the result
to higher dispersion values. This is primarily due to the
smaller measurement errors in the red giant population, which
give them a high relative weight in determining the velocity
dispersion despite their small numbers (six giant branch stars
in total).

We have also investigated the intrinsic dispersion of these 
giant branch stars. Using the same method outlined above but 
removing the main-sequence stars that are identified as Segue 
1 members in Paper I, we find that the intrinsic dispersion
is 2.0;^j ' km s"'. This is consistent with the dispersion ob- 
tained from the full sample. The large error bars are due to 
small number of Segue 1 members as compared to the Milky 
Way stars in the sample. Using the less statistically rigorous 
method of assuming these six giant branch stars are definite 
members and not including any other stars (or a second Milky 
Way distribution in the likelihood), we obtain a dispersion of 
1.7^[^ km s"'. While it is in principle possible that the giants 
and the main sequence stars could trace two kinematically dis- 
tinct populations, this scenario is physically very unlikely be- 
cause the ages and masses of the giants should be effectively 
identical to those of their less-evolved counterparts. However,
the small number of stars on the RGB precludes us from con- 
clusively ruling out such an occurrence. 

One particularly robust constraint from our analysis concerns
the stellar number density profile. Figure 6 shows the inferred
probability density of the projected (two-dimensional)
radius that contains half of the member stars of Segue 1 ($R_{1/2}$).
Three of the probability densities plotted use our standard
conditional likelihood $\mathcal{L}(v, w | R)$ and explore
how our derived $R_{1/2}$ constraints depend on the assumed stellar
density profile shape. The solid, dash-dotted, and dotted
lines, respectively, are derived using a Plummer model, a Sersic
model, and a modified Plummer model
where the outer slope ($\alpha$) is marginalized from 3 to 10.
Regardless of the assumed stellar density profile, $R_{1/2}$ is typically
constrained to be 30-50 pc. The triple-dot-dashed line shows
the probability density that results when we include the position
information directly in the likelihood using Eq. (2). In this
case, the constraints on $R_{1/2}$ get even tighter, with $R_{1/2} = 28^{+5}_{-4}$
pc, which is in good agreement with the best photometric
determination of $R_{1/2} = 29^{+8}_{-5}$ pc (Martin et al. 2008; 1$\sigma$ range
indicated with vertical dotted lines). We emphasize that we have
not used any prior on the light distribution of Segue 1 from
photometry in our analysis. The $R_{1/2}$ determinations shown
in Figure 6 are derived entirely from our complete kinematic
sample.

5. BINARY POPULATION OF SEGUE 1 

We showed in the previous section that the binary correction
to the velocity dispersion of Segue 1 is likely to be small,
in spite of the large velocity variations observed for a few of
the stars. In this section, we investigate the corresponding
constraints on the binary population of Segue 1 obtained by
our method. By marginalizing over all other parameters, we
infer probability distributions in the binary fraction $B$, mean
log-period $\mu_{\log P}$, and width of the period distribution $\sigma_{\log P}$.
We find that while the binary fraction and width of the period
distribution are poorly constrained, the mean period is much
better constrained. In Figure 7 we plot the posterior probability
distribution of the mean log-period $\mu_{\log P}$. The most
probable inferred mean period is $\approx 10$ years. This is significantly
shorter than the 180 year mean period of solar neighborhood
field binaries, although a 180 year mean period is
still allowed at the 1$\sigma$ level. As shown by the dotted line
in Figure 7, if the giants are excluded from the sample, the
mean period is longer. Given the small sample of RGB stars,
this is likely because excluding the giants removes the one
star that shows strong evidence of binary orbital motion with
a period of ~1 yr. To see how the inferred intrinsic dispersion
is affected by the mean period, in Figure 8(a) we plot
the joint probability distribution of mean period and intrinsic
dispersion. If the inferred dispersion is larger than 1 km s$^{-1}$,
the probability distribution of the mean period follows that
derived in the full sample (see Figure 7). By contrast, the region
where the inferred dispersion is smaller than 1 km s$^{-1}$
is dominated by mean periods of ~2 years; if the mean period
were longer, binaries could not possibly account for the
observed $\approx 4$ km s$^{-1}$ dispersion (see the dashed line in Figure
7). However, the bulk of the probability is in the region of $\sigma >$
1 km s$^{-1}$, even though the mean period appears to be shorter
than that of solar neighborhood field binaries.

While the width of the period distribution $\sigma_{\log P}$ is poorly
constrained on its own, Figure 8(b) shows there is a correlation
between the inferred mean period $\mu_{\log P}$ and $\sigma_{\log P}$ which
is particularly noticeable for mean periods $\gtrsim 100$ years. This
reflects the fact that for long mean periods, the period distribution
must be sufficiently wide to include periods short enough
to account for the observed velocity variations. More striking
is the correlation between the inferred binary fraction and the
mean period, shown in Figure 8(d). This relation shows, for
example, that if the mean period were shorter than $\approx 1$ month,
the binary fraction would have to be smaller than $\approx 0.3$; otherwise,
the binaries would generate a large non-Gaussian tail
in the velocity distribution that is inconsistent with the data.
This constraint is important in that it places an effective upper
bound on the probability that the observed dispersion of
Segue 1 is entirely due to binaries, a bound which is almost
prior-independent. To show this, in the left panel of Figure 9
we plot the probability distribution of the mean period using
three different priors on the mean period: the Milky Way composite
prior discussed in Section 3.2, a flat prior with a minimum
mean period of one week, and an exponential prior that
is strongly biased to short mean periods. We find the most
probable inferred mean period is only slightly different between
the flat prior and Milky Way composite prior, while the
exponential prior produces very short mean periods of less
than one year. However, as we can see in the right panel of
Figure 9, the effect on the inferred dispersion is minor. Even
though the exponential prior is biased to very short mean periods,
the binary fraction is constrained to be smaller than 0.2 in
that case, which limits the effect that such an extreme binary
population can have on the observed dispersion. In essence,
if a large fraction of the stars were short-period binaries, then
the data would have revealed them. The data did not, which
forces the binary fraction to be small, thus limiting the effect
on the dispersion.

6. CONTAMINATION BY A SPATIALLY OVERLAPPING 
TIDAL STREAM? 

We can see from Figure 1 that it is unlikely that current data
can pick out a third population in addition to the Segue 1
and Milky Way populations. To test this, we considered two
cases. For the first case, we assumed that the third population
is spatially uniform, consistent with an overlapping tidal
stream (e.g., the Sagittarius stream; see Niederste-Ostholt et al.
2009). The velocity and metallicity distributions of the stream
were varied independently of the Segue 1 and Milky Way
distributions. For the second case, we again assumed the third
population has its own independent velocity and metallicity
distributions, but has a spatial distribution given by a modified
Plummer profile whose Plummer radius and outer slope
are allowed to vary independently from those of the primary
Segue 1 population. The second case, therefore, is a test to see
if there is any evidence for two stellar populations in Segue 1.
In both cases, we assumed that the binary period, eccentricity,
and mass ratio distributions for the third population were the
same as for the primary Segue 1 population. Additionally, the
priors for parameters defining the third population are set to
be the same as those for Segue 1 (which are listed in Table 2).
Our results show that with the current data set, the third
population is unconstrained. This implies that there is no third
population sufficiently offset from Segue 1 in its spatial
and velocity distribution to be detected in this data set.
Turning this argument around and discussing how different a
third population has to be to be detected in this data set is an
involved question that takes us well beyond the aims of the
present paper. We do, however, show in Figure 10 that the
extra degrees of freedom associated with the third population do
Fig. 5. Inferred probability density of the velocity dispersion of Segue 1. Left: comparing the probability density with (solid black line) and without (triple-dot-dashed
blue line) the correction due to binary motion, we see that correcting for binaries results in a lower inferred dispersion and gives rise to a tail at low
velocities, due mainly to short-period binaries (Section 5). Right: note that excluding the star SDSSJ100704.35+160459.4, which is a 6$\sigma$ velocity outlier with
a substantial membership probability, does not have a significant impact on the inferred dispersion (dash-dotted cyan line) since its possible membership and
binarity are treated statistically (see paragraph 2 of Section 4). Exclusion of the red giants biases the probability distribution (dashed red line) to higher dispersion
values; this is primarily due to their smaller measurement errors, which give them a large relative weight in determining the velocity dispersion despite the small
number of probable members (six RGB stars in total).
Fig. 6. Probability density of the projected (two-dimensional) radius containing
half the member stars of Segue 1. The wider distributions were computed
using our conditional likelihood assuming a Plummer
model (solid black), a modified Plummer model (dash-dotted green), and a
Sersic model (dashed red) for the stellar density profile. The triple-dot-dashed
blue line shows that the probability density is further constrained when we include
the full position information in the likelihood ($\mathcal{L}(v, w, r)$). The vertical
lines bracket the 68% confidence region of the best photometric determination
of $R_{1/2}$ (Martin et al. 2008).

Fig. 7. — Probability density of the mean log-period of Segue 1's binary 
population (solid curve). For comparison we plot our fiducial prior on the mean 
period (triple-dot-dashed curve), which is determined by the requirement that 
a large number of binary populations drawn from this prior superimpose to 
form the log-normal period distribution of field binaries observed in the 
solar neighborhood, which have a mean period of 180 years. We see that the 
data suggest that the mean period of Segue 1 may be significantly shorter 
than that of field binaries, with a most probable inferred mean period of about 
10 years. The dashed (dot-dashed) curve is the distribution in the parameter 
space where the inferred dispersion is smaller (larger) than 1 km s$^{-1}$. 



not significantly affect the inferred probability distribution of 
the intrinsic dispersion of Segue 1. 

These tests reinforce the analysis in Paper I, where we focused 
specifically on the issue of contamination by Sagittarius 
stream stars. The results of a Monte Carlo analysis there 
showed that it was unlikely that the measured dispersion for 
Segue 1 was due to a small number of stream stars contam- 
inating the sample. The analysis described in this section 
echoes that result and shows in a rigorous statistical man- 
ner that there is no evidence for a third population given the 
present sample of velocities. 



7. CONCLUSIONS 

We have introduced a comprehensive Bayesian method to 
analyze multi-epoch velocity measurements of Milky Way 
satellites that incorporates uncertainties due to imperfect 
knowledge of membership and binary orbital motion of stars. 
We applied this method to Segue 1 using the kinematic data 
set described in Paper I, which includes 181 candidate member 
stars, 67 of which have repeat measurements. We model 
the likelihoods of relevant populations (Milky Way, Segue 1, 
and possibly an overlapping stream) and thereby incorporate 
membership probabilities implicitly in the method. Our results 
support the interpretation that Segue 1 is a dark-matter-dominated 
galaxy with an intrinsic velocity dispersion of 
$3.7^{+1.4}_{-1.1}$ km s$^{-1}$ at $1\sigma$. We stress here that the multi-epoch data 
analysis is critical: with just the average velocity for each 
star, the possibility that most of the observed dispersion of 
Segue 1 is due to binary orbital motion cannot be disfavored. 

Fig. 8. — Joint posterior probability distributions of: (a) intrinsic dispersion vs. mean log-period of the Segue 1 binary population, (b) width of period distribution 
vs. mean log-period, (c) mean Ca width of the member stars vs. mean log-period, and (d) binary fraction vs. mean log-period. Inner and outer contours surround 
the regions containing 68% and 95% of the total probability, respectively. 

Fig. 9. — Plotted are the probability densities of the mean orbital period and the corresponding dispersion assuming various priors on the binary distribution 
parameters. Although the period distribution is not well constrained by data, we find that the dispersion probability density is surprisingly robust to the shape 
of the period distribution. Here, we compare our fiducial Milky Way composite prior on the binary period distribution to priors that preferentially select 
short-period binary solutions. Solutions whose priors prefer short periods (e.g., flat (red dashed line) and exponential (blue dot-dashed line) mean period priors) have 
dispersion probability densities that agree remarkably well. This is true even when both the mean period and the width of the period distribution are biased low 
(green dotted line). 

Fig. 10. — Plotted curves are the probability density of the dispersion 
assuming the fiducial model with Milky Way and Segue 1 stellar populations 
(solid black), assuming in addition a third stream-like population (dashed 
red), or assuming a third Segue 1-like population (dash-dotted green). 
Reassuringly, we find that the extra degrees of freedom in terms of the third 
population did not significantly affect the posterior for the intrinsic 
dispersion. 

Our method produces a posterior for the membership 
probability of each star and simultaneously constrains the 
radial distribution of Segue 1 member stars without appealing 
to separate photometry. Using the full likelihood (see 
Equation (2) and Table 2), we find $R_{1/2} = 28^{+5}_{-4}$ pc, which is 
in excellent agreement with past photometric measurements 
(Martin et al. 2008). We also included the slope of the stellar 
profile (see Equation (6)) in our full likelihood analysis and 
find that $\alpha = 4.1^{+2.0}$, which is consistent with the standard 
Plummer profile ($\alpha = 5$). 

To include the possibility that each star is in a binary sys- 
tem, we modified the velocity likelihood of Segue 1 to take 
into account changes in the velocity distribution resulting 
from binary orbital motion. The binary properties of the 
Segue 1 ensemble were parameterized by a log-normal distri- 
bution in period and a total binary fraction. Only the mean pe- 
riod was marginally constrained, with a most probable mean 
period of 10 years, much shorter than the 180 year mean 
period for binary stars in the solar neighborhood. However, our 
results are still consistent with a mean period of 180 years at 
about $1\sigma$. 

We also found a slight degeneracy between the binary frac- 
tion and mean period with the binary fraction decreasing with 
the mean period. The case where a large fraction of Segue 1 
stars are short period binaries (P < 1 yr) is disfavored by the 
lack of large velocity variations (relative to the errors) in the 
repeat measurements. One implication is that our inferred in- 
trinsic velocity dispersion is robust to the period distribution. 
We explicitly tested this by varying the priors on the binary 
parameters and found no significant effect on the inferred 
intrinsic dispersion probability distribution. The inferred 
intrinsic dispersion was also insensitive to the inclusion or 
exclusion of velocity outliers. 
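
The role of the repeat measurements can be illustrated with a minimal sketch. It is not the Bayesian likelihood used in this paper; it only shows the simpler idea of comparing a star's velocity change between two epochs with the combined measurement error. The function names, the 3-sigma threshold, and the example velocities are all hypothetical.

```python
import math


def velocity_change_significance(v1, e1, v2, e2):
    """Return |v2 - v1| in units of the combined measurement error."""
    return abs(v2 - v1) / math.hypot(e1, e2)


def count_variable_stars(epochs, threshold=3.0):
    """Count stars whose two-epoch velocity change exceeds `threshold` sigma.

    `epochs` is a list of (v1, e1, v2, e2) tuples in km/s.
    The threshold is an illustrative choice, not a value from the paper.
    """
    return sum(
        1
        for v1, e1, v2, e2 in epochs
        if velocity_change_significance(v1, e1, v2, e2) > threshold
    )


# Hypothetical two-epoch measurements (km/s), for illustration only.
example_epochs = [
    (208.0, 2.0, 209.5, 2.5),  # change well within the errors
    (205.0, 3.0, 203.0, 3.0),  # change well within the errors
    (210.0, 1.0, 222.0, 1.5),  # large change: binary candidate
]
n_variable = count_variable_stars(example_epochs)
print(f"{n_variable} of {len(example_epochs)} stars vary by more than 3 sigma")
```

A sample dominated by short-period binaries would be expected to show many such significant changes, which is the qualitative signature the repeat measurements constrain.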

Our results show that the velocity dispersion of Segue 1 is 
larger than 1 km s$^{-1}$ at the 98.3% confidence level. The small 
probability of dispersions lower than 1 km s$^{-1}$ is caused by 
the possibility of binary stars with short periods inflating the 
velocity dispersion. Note that with a 1 km s$^{-1}$ velocity 
dispersion, Segue 1 would have $M_{1/2}/L_{1/2} \simeq 150$ within the 
half-light radius (if interpreted as a system in equilibrium) 
and therefore would still be among the most dark-matter-dominated 
satellites of the Milky Way (Walker et al. 2009a; 
Wolf et al. 2010). An alternative interpretation at this 
confidence level would be that Segue 1 is a star cluster that is 
disrupting and hence had its intrinsic velocity dispersion inflated 
to about 1 km s$^{-1}$, and with parameters such that binary 
orbital motion contributes an additional ~3 km s$^{-1}$ dispersion. 
Beyond the low probability we determined, this possibility 
seems unlikely on other grounds as well. The Jacobi radius at 
a distance of 23 kpc with $(M/L)_{\rm stellar} \sim 10$ (an extreme value) 
is about 30 pc, smaller than the region covered by our Segue 
1 sample of stars (about 70 pc). Hence, we expect to see tidal 
features (Peñarrubia et al. 2009), yet none could be identified 
(see Paper I for more details). 
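
The order of magnitude of the Jacobi radius quoted above can be checked with a rough sketch. It assumes a point-mass host, $r_J \approx D\,[m/(3M_{\rm host})]^{1/3}$, and a host mass enclosed within $D$ estimated from a flat rotation curve. The stellar luminosity of 340 $L_\odot$ and the circular speed of 220 km s$^{-1}$ are assumptions introduced here for illustration and are not taken from this section, so the result need not reproduce the quoted value exactly.

```python
# Gravitational constant in pc (km/s)^2 / Msun.
G = 4.30091e-3


def enclosed_host_mass(v_circ_kms, distance_pc):
    """Host mass inside `distance_pc` for a flat rotation curve: M = v^2 D / G."""
    return v_circ_kms**2 * distance_pc / G


def jacobi_radius(satellite_mass, host_mass, distance_pc):
    """Point-mass approximation: r_J = D * (m / (3 M_host))^(1/3)."""
    return distance_pc * (satellite_mass / (3.0 * host_mass)) ** (1.0 / 3.0)


luminosity_lsun = 340.0   # ASSUMED stellar luminosity of Segue 1 (Lsun)
m_over_l = 10.0           # extreme stellar mass-to-light ratio from the text
distance_pc = 23_000.0    # distance used in the text (23 kpc)
v_circ_kms = 220.0        # ASSUMED Milky Way circular speed (km/s)

host_mass = enclosed_host_mass(v_circ_kms, distance_pc)
r_j = jacobi_radius(m_over_l * luminosity_lsun, host_mass, distance_pc)
print(f"Jacobi radius ~ {r_j:.0f} pc")
```

Under these assumptions the Jacobi radius comes out at a few tens of parsecs, well inside the roughly 70 pc region covered by the stellar sample.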

From the inferred probability distribution for the intrinsic 
dispersion, we find that there is only about 0.4% probability 
that the intrinsic dispersion is smaller than about 0.3 km s$^{-1}$. 
Interpreted as an equilibrium system, such a dispersion would 
imply $(M_{1/2}/L_{1/2})_{\rm stellar} < 10$. At this confidence level, 
therefore, the stellar velocity data allow for the possibility that 
Segue 1 is a star cluster, albeit with a rather extreme stellar 
population, and with the measured velocity dispersion 
dominated by the orbital motion of binary stars with mean periods 
of around a year (see Figure 7). However, given the tidal 
arguments above, it is unclear how we may think of this system 
as being in equilibrium when it is not dark matter dominated. 
In addition, the large measured metallicity spread is also not 
consistent with the star cluster hypothesis (Paper I). 

The most likely interpretation of our results is that Segue 1 
is a dark matter dominated galaxy. In this case, our inferred 
velocity dispersion implies a mass 

within a sphere that encloses half the galaxy's stellar 
luminosity, which from our full likelihood analysis is $r_{1/2} = 36$ pc. 
To calculate this mass, we have used $M_{1/2} = 3 G^{-1} r_{1/2} \sigma^2$ 
from Wolf et al. (2010) along with the distribution for $r_{1/2}$ 
and $\sigma$ that we have derived using the full likelihood. The 
average density of dark matter within this radius is therefore 
$\rho_{1/2} = 2.5^{+4.1}_{-1.9}\,M_\odot\,{\rm pc}^{-3}$, which is the highest density of dark 
matter yet measured in any Local Group object (Wolf et al. 
2010; Walker et al. 2009a). We note here that a flat prior in 
the intrinsic dispersion has been used throughout the paper. If 
a flat prior in $M_{1/2}$ is imposed, the resultant confidence 
interval changes slightly, with a central value of $M_{1/2} = 5.2 \times 10^{5}\,M_\odot$. 
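
As a rough numerical illustration of the estimator $M_{1/2} = 3 G^{-1} r_{1/2} \sigma^2$, the sketch below plugs in the central values $\sigma = 3.7$ km s$^{-1}$ and $r_{1/2} = 36$ pc and divides by the volume of a sphere of radius $r_{1/2}$. This is a point estimate only: the values quoted in the text are derived from the full posterior distributions of $r_{1/2}$ and $\sigma$, so the numbers printed here are not expected to match them exactly. The function names are illustrative.

```python
import math

# Gravitational constant in pc (km/s)^2 / Msun.
G = 4.30091e-3


def wolf_half_light_mass(sigma_kms, r_half_pc):
    """Mass within the 3D half-light radius: M_1/2 = 3 sigma^2 r_1/2 / G."""
    return 3.0 * sigma_kms**2 * r_half_pc / G


def mean_density(mass_msun, radius_pc):
    """Average density inside a sphere of the given radius (Msun / pc^3)."""
    return mass_msun / (4.0 / 3.0 * math.pi * radius_pc**3)


sigma = 3.7    # central value of the intrinsic dispersion (km/s)
r_half = 36.0  # central value of the 3D half-light radius (pc)

m_half = wolf_half_light_mass(sigma, r_half)
rho_half = mean_density(m_half, r_half)
print(f"M_1/2   ~ {m_half:.2e} Msun")
print(f"rho_1/2 ~ {rho_half:.2f} Msun / pc^3")
```

The point estimate gives a mass of a few times $10^5\,M_\odot$ and a mean density of order 1 $M_\odot\,{\rm pc}^{-3}$, the same order as the posterior-based values.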

It is worth emphasizing that this dark matter density is 
among the highest dark matter densities known definitively 
in any galaxy. Some of the larger elliptical galaxies 
are measured to have comparably high mass densities at 
their half-light radii ($\sim 1\,M_\odot\,{\rm pc}^{-3}$), but inferring similarly high 
dark matter density is complicated by the fact that galaxies 
of this type are baryon-dominated in their centers (e.g., 
Cappellari et al. 2006; Tollerud et al. 2010). For 
rotation-supported galaxies, the highest dark matter density would be 
obtained at the innermost point where the rotation velocity is 
reliably measured. For example, some of the nearby low surface 
brightness galaxies in Kuzio de Naray et al. (2008) show 
rotation speeds of 10–40 km s$^{-1}$ at about 300 pc, or an average dark 
matter density in the range 0.05–1 $M_\odot/{\rm pc}^3$. For reference, if 
we assume a typical NFW profile for the Milky Way, then the 
Milky Way will have this density in dark matter at a radius of 
about 100 pc (assuming a cold dark matter model). At the other 
end of the mass range, for a halo with $V_{\max}$ less than 10 km s$^{-1}$, 
these densities will occur at radii smaller than the half-light 
radius of Segue 1 (assuming cold dark matter), suggesting that 
Segue 1 has a more massive halo. A third way of inferring 
the total mass in a galaxy is through strong lensing, which 
provides a measurement of the surface mass density. In order 
for strong lensing to occur, the surface mass density has to be 
larger than a critical value. Taking the angular diameter 
distance to the lens, $D = 1$ Gpc, assuming that the lens is halfway 
between the source and observer, and using $\sim 1$ arcsec for the 
angular size of a typical Einstein ring (Bolton et al. 2008) 
yields a characteristic total mass density (not all dark matter) 
within the Einstein ring of $c^2/(2\pi G D^2\,{\rm arcsec}) \sim 1\,M_\odot/{\rm pc}^3$. 
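
The comparison densities in this paragraph can be reproduced to order of magnitude with a short sketch. For a rotation-supported system the mean enclosed density is $\bar\rho = 3V^2/(4\pi G r^2)$. For lensing, the sketch assumes that the characteristic density is the critical surface density for a lens halfway to the source, $c^2/(2\pi G D)$, divided by the Einstein radius $D\theta$. The rotation speeds of 10 and 40 km s$^{-1}$ are assumed endpoints chosen to bracket the quoted density range, and the function names are illustrative.

```python
import math

G = 4.30091e-3                              # pc (km/s)^2 / Msun
C_KMS = 299_792.458                         # speed of light (km/s)
ARCSEC_RAD = math.pi / (180.0 * 3600.0)     # one arcsecond in radians


def mean_density_from_rotation(v_kms, r_pc):
    """Mean density inside r for circular speed v: 3 v^2 / (4 pi G r^2)."""
    return 3.0 * v_kms**2 / (4.0 * math.pi * G * r_pc**2)


def lensing_density_scale(d_pc, theta_rad):
    """ASSUMED form: critical surface density c^2 / (2 pi G D), for a lens
    halfway to the source, divided by the Einstein radius D * theta."""
    return C_KMS**2 / (2.0 * math.pi * G * d_pc**2 * theta_rad)


rho_slow = mean_density_from_rotation(10.0, 300.0)    # assumed lower speed
rho_fast = mean_density_from_rotation(40.0, 300.0)    # assumed upper speed
rho_lens = lensing_density_scale(1.0e9, ARCSEC_RAD)   # D = 1 Gpc, 1 arcsec

print(f"rotation, 10 km/s at 300 pc: {rho_slow:.2f} Msun/pc^3")
print(f"rotation, 40 km/s at 300 pc: {rho_fast:.2f} Msun/pc^3")
print(f"lensing scale:               {rho_lens:.2f} Msun/pc^3")
```

Both estimates land in the range of roughly 0.05 to 1 $M_\odot\,{\rm pc}^{-3}$, which is the scale against which the Segue 1 density is being compared.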

If we assume that all the dwarf spheroidal satellite galaxies 
of the Milky Way inhabit similar halos (Strigari et al. 2008), 
this high density implies that Segue 1 has the highest phase 
space density among all dwarfs and hence should provide the 
best constraints on thermal and non-thermal warm dark 
matter. The large determined dark matter halo mass also validates 
previous expectations (Geha et al. 2009; Martinez et al. 2009) 
that Segue 1 is an excellent target for the indirect detection of 
dark matter and a useful laboratory for studying galaxy 
formation at the extreme faint end of the luminosity function. 